-
-
Notifications
You must be signed in to change notification settings - Fork 19.1k
Typ parts of python parser #45015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Typ parts of python parser #45015
Conversation
phofl
commented
Dec 22, 2021
- Ensure all linting tests pass, see here for how to run them
# Conflicts: # pandas/io/parsers/c_parser_wrapper.py # pandas/io/parsers/python_parser.py
looks fine can you rebase again |
Done |
except StopIteration: | ||
self.buffer = None | ||
line = next(self.f) | ||
line = next(self.f) # type: ignore[arg-type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twoertwein Does ReadCsvBuffer suport next in addition to iter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand: As long as .__iter__()
returns an object that has __next__
, I don't think the file object itself has to have __next__
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm weird, mypy complains about ReadCsvBuffer[str]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, you are right! I think only read_fwf
calls next explicitly, or? read_csv
does not explicitly call next
. The following works with read_csv
from io import StringIO
import pandas as pd
class Test:
def __init__(self):
self.buffer = StringIO("a,b,c\n0,1,2")
@property
def mode(self):
return self.buffer.mode
def fileno(self):
return self.buffer.fileno()
def seek(self, __offset, __whence=-1):
if __whence != -1:
return self.seek(__offset)
return self.seek(__offset, __whence=__whence)
def seekable(self):
return self.buffer.seekable()
def tell(self):
return self.buffer.tell()
def read(self, __n=None):
self.buffer.read(__n)
def __iter__(self):
return self.buffer.__iter__()
def readline(self):
return self.buffer.readline()
@property
def closed(self):
return self.buffer.closed()
print(pd.read_csv(Test(), engine="python"))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I think this is only called from read_fwf.
Can we add next to ReadCsvBuffer without breaking anything (from a correctness standpoint)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be fine with adding it or (probably better) creating ReadFwfBuffer
which inherits from ReadCsvBuffer
(not sure whether it needs everything that is in ReadCsvBuffer
).
Also happy to have the ignores for now and address this in a later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets go with the ignore for now then until we have typed more. Can probably make a better decision then
Thanks @phofl |
* TYP: Type python parser * Fix bug * Fix assignment issue * Adress conflicts * Remove unnecessary changes * Adjust
* TYP: Type python parser * Fix bug * Fix assignment issue * Adress conflicts * Remove unnecessary changes * Adjust